[WIP][API-Compat] Add paddle.compat.min/max and new PHI kernel (min/max_with_index) #74512

Enigmatisms · 2025-08-09T17:21:19Z

PR Category

Operator Mechanism

PR Types

New features

Description

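This PR is a reopened version of #74495, rebased onto a newer commit, with the conflicts against #74506 resolved. #74495 was only used to surface problems in CI. This PR attempts to fix one of them: the operator shares the amin/amax backward op, but amin/amax does not support certain integer types. The fix relies on SFINAE plus a Python-side check. Until #74506 is merged, this PR will appear to contain too many changes, because it includes part of the changes from the preceding PRs; this should resolve automatically once those PRs are merged.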

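This PR is not finished: the corresponding unit tests are missing (testing has been done; see the TODO at the end), and it depends on a prerequisite PR, #74446, for its ForbidKeywordsDecorator decorator. That PR has not been merged yet; this description will be updated once it is.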
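For readers unfamiliar with the idea, a keyword-forbidding decorator can be sketched roughly as follows. This is a hypothetical illustration, not the ForbidKeywordsDecorator from #74446: the name `forbid_keywords`, its parameters, the forbidden names `axis`/`x`, and the error message are all assumptions made for the example.

```python
import functools


def forbid_keywords(illegal_keys, suggestion):
    """Hypothetical decorator: reject calls that use any keyword in illegal_keys."""
    illegal = frozenset(illegal_keys)

    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            used = illegal.intersection(kwargs)
            if used:
                raise TypeError(
                    f"{func.__name__}() got forbidden keyword(s) {sorted(used)}; "
                    f"use {suggestion} instead"
                )
            return func(*args, **kwargs)

        return wrapper

    return decorator


# Assumed keyword names, chosen only to illustrate the mechanism.
@forbid_keywords({"axis", "x"}, suggestion="'input' and 'dim'")
def compat_min_stub(input, dim=None, keepdim=False):
    # Stand-in body; a real implementation would dispatch to the kernels.
    return (input, dim, keepdim)
```

Calling `compat_min_stub([1, 2], dim=0)` goes through unchanged, while `compat_min_stub([1, 2], axis=0)` raises a `TypeError` that names the offending keyword.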

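Features added in this PR:

- Two new PHI kernels, (min/max)_with_index, built on cub's ArgMin/ArgMax operations, which reduce key and value together. These PHI kernels are supported on GPU only. Assessment:
  - On CPU, argmin/argmax is built on the argmin/argmax operations provided by Eigen::TensorMap, which cannot reduce the value at the same time.
  - Other backends are not considered for now.
- The backward operators for the new PHI kernels: (min/max)_with_index_grad. Note ⚠️: this backward behavior is inconsistent with the torch documentation! The reason is that the torch documentation is wrong; see the issue I filed with PyTorch, Torch Issue 160273. The actual behavior is not that of amin/amax at all: the gradient is passed only to the position of the minimum/maximum index, as if a take_along_axis were applied to the gradient.
- Other support for the new operators (such as InferMeta/InferMetaSymbolic).
- Two new operations, paddle.compat.min and paddle.compat.max, aligned with torch's behavior.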
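The forward and backward behavior described in the list above can be sketched in plain Python over 2-D lists. This is a minimal illustration under stated assumptions, not the CUDA kernel: the function names mirror the kernel names for readability only, reduction is over the last axis, and keeping the first minimum on ties is an assumption.

```python
def min_with_index(rows):
    """Reduce each row of a 2-D list; return (values, indices) from one pass."""
    values, indices = [], []
    for row in rows:
        best = 0
        for j in range(1, len(row)):
            if row[j] < row[best]:  # strict '<' keeps the first minimum on ties (assumed)
                best = j
        values.append(row[best])
        indices.append(best)
    return values, indices


def min_with_index_grad(rows, indices, grad_values):
    """Route each upstream gradient only to the selected index; all else gets zero."""
    grad_rows = [[0.0] * len(row) for row in rows]
    for i, (j, g) in enumerate(zip(indices, grad_values)):
        grad_rows[i][j] = g
    return grad_rows


x = [[3.0, 1.0, 1.0], [2.0, 5.0, 0.5]]
values, indices = min_with_index(x)
grad_x = min_with_index_grad(x, indices, [10.0, 20.0])
```

In this sketch the first row has two tied minima, yet only the selected index receives the upstream gradient; the tie does not share it.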

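The input/output relationships of torch.min/torch.max are complicated (one API covers too many functions):

1. Single input: full reduce, outputs a tensor only.
2. With dim/keepdim: outputs values/indices.
3. With other: behaves the same as minimum/maximum.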
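The three call forms can be made concrete with a small dispatcher over 2-D Python lists. This is a hypothetical sketch, not the paddle.compat.min implementation: the name `compat_min_sketch`, the `dim_or_other` parameter, and the restriction to the last axis are assumptions made to keep the example short.

```python
def compat_min_sketch(x, dim_or_other=None, keepdim=False):
    """Hypothetical dispatcher over 2-D lists mirroring the three call forms."""
    if dim_or_other is None:
        # Case 1: full reduce, a single value comes back.
        return min(v for row in x for v in row)
    if isinstance(dim_or_other, int):
        # Case 2: reduce along a dim; this sketch handles the last axis only.
        if dim_or_other not in (1, -1):
            raise NotImplementedError("sketch supports the last axis only")
        indices = [row.index(min(row)) for row in x]
        values = [row[j] for row, j in zip(x, indices)]
        if keepdim:
            values = [[v] for v in values]
            indices = [[j] for j in indices]
        return values, indices
    # Case 3: `other` given, elementwise minimum of two same-shaped inputs.
    return [[min(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(x, dim_or_other)]


x = [[3, 1], [0, 4]]
full = compat_min_sketch(x)                        # case 1
along = compat_min_sketch(x, 1)                    # case 2
kept = compat_min_sketch(x, 1, keepdim=True)       # case 2 with keepdim
pairwise = compat_min_sketch(x, [[2, 2], [2, 2]])  # case 3
```

The point of the sketch is that the return type depends on which arguments are supplied, which is what makes aligning with this API awkward.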

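Apart from case 2 above, which calls (min/max)_with_index on the CUDA GPU backend, every other case gets its result from Python calling _C_ops.xxx. Cases 1, 2 and 3 should all perform well on the CUDA GPU backend (no composition; each is a single operator call). On other backends, case 2 is composed from argmin/argmax and take_along_axis (together with a squeeze_ operation); this is not the best-performing approach, but it should offer good value for the development effort.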
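The composed fallback for case 2 can be sketched as three steps. The helpers below are plain-Python stand-ins for argmin, take_along_axis and squeeze_ over 2-D lists; their names, the exact ordering of the steps, and the last-axis restriction are assumptions, not the code in this PR.

```python
def argmin_keepdim(rows):
    """argmin along the last axis, keeping that axis with size 1."""
    return [[min(range(len(row)), key=row.__getitem__)] for row in rows]


def take_along_last_axis(rows, index_rows):
    """Gather rows[i][j] for every j in index_rows[i]."""
    return [[row[j] for j in idx] for row, idx in zip(rows, index_rows)]


def squeeze_last(rows):
    """Drop a trailing axis of size 1."""
    return [row[0] for row in rows]


def min_fallback(rows, keepdim=False):
    indices = argmin_keepdim(rows)                # step 1: indices only
    values = take_along_last_axis(rows, indices)  # step 2: gather the values
    if not keepdim:                               # step 3: squeeze both outputs
        values, indices = squeeze_last(values), squeeze_last(indices)
    return values, indices


x = [[3.0, 1.0, 2.0], [0.5, 4.0, 0.5]]
squeezed = min_fallback(x)
kept = min_fallback(x, keepdim=True)
```

Compared with a fused kernel, this path reads the input twice (once to find the indices, once to gather the values), which is the performance cost the paragraph above refers to.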

TODO

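25 simple local unit tests (single operator, different input combinations)
PaConvert library tests (10/14); the 4 failing tests are caused by the out mechanism not being implemented. I will not address this issue for now.
PHI kernel unit tests (for example, static graph OpPass unit tests)
test_compat_minmax.py: operator unit tests that meet the unit-test coverage requirement.
English documentation inside the operators' Python functions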

Pcard-89620

paddle-bot · 2025-08-09T17:21:25Z

Your PR has been submitted. Thanks for your contribution to the open-source project!
Please wait for the results of the automated CI tests first. See the Paddle CI Manual for details.

Enigmatisms · 2025-08-11T01:00:18Z

/re-run all-failed


Enigmatisms force-pushed the min_max_api branch from ec7ad13 to 8b26814 Compare August 11, 2025 02:01

Enigmatisms added 15 commits August 11, 2025 02:14

[API-Compat] paddle.compat.split is added and tested

255f8f1

[API-Compat] paddle.compat.split is rigorously tested

95aedbe

[API-Compat] Fixed erroneous func help doc

6166b8d

[API-Compat] Make the forbid_keywords decorator transparent

41a0775

[API-Compat] Fixed decorator str input

c8930f8

[API-Compat] Fixed type annotation and removed legacy graph branch

d66c5f1

[API-Compat] More unittest & static graph check & updated decorator

b40feed

[API-Compat] Force update (local and not reproduce the bug)

4eb4925

[API-Compat] Removed unittest that paddle.split will also fail

48f5bb0

[API-Compat] Add paddle.compat.min/max and new PHI kernel (min/max_wi…

933c7c0

…th_index)

[API-Compat] Add compat.min/max EN doc

93d7c86

Attempting to fix integral type gradient computation (rejection)

[WIP][API-Compat] Add dyna-graph unittests for min/max

8ebe825

[WIP][API-Compat] Fixed CPU failure

3d42943

[API-Compat] Correct min/max_with index gradient behavior

ff38ddb

[API-Compat] XPU fix (attempt)

09aeb0d

Enigmatisms force-pushed the min_max_api branch from 8b26814 to 09aeb0d Compare August 11, 2025 02:26

Enigmatisms and others added 11 commits August 11, 2025 02:35

[API-Compat] Updated ForbidKeywordsDecorator

2f77d94

Rename ctx to dev_ctx in paddle/phi/kernels/ [fluid_ops] (PaddlePaddl…

e05a82d

…e#74479)

refine some error message to avoid linking words together part7 (Padd…

5ed7519

…lePaddle#74519)

refine some error message to avoid linking words together part6 (Padd…

73b235f

…lePaddle#74520)

[AutoParallel] fix the grad_clip logic of auto_hybrid_pp (PaddlePaddl…

580d4a9

…e#74409) * fix the grad clip performance * add test * empty commit to rerun CI * modify the note * Simplify code logic

test/cpp rename use_mkldnn (PaddlePaddle#74501)

9076968

test/ directory modify use_mkldnn [fluid_ops] - part (PaddlePaddle#74487

fc98858

) * Fix * Fix * Fix

test/deprecated/cpp modify use_mkldnn [fluid_ops] (PaddlePaddle#74502)

e237e01

rename ctx to dev_ctx,xpu_ctx (PaddlePaddle#74513)

4a9c975

is_test_pass_tester.cc modify use_mkldnn [fluid_ops] (PaddlePaddle#74518

6b71a4b

)

create_inference_config modify use_mkldnn [fluid_ops] (PaddlePaddle#7…

f962b15

…4516) * Fix * Fix * Fix

ooooo-create and others added 14 commits August 11, 2025 12:57

fix typos (PaddlePaddle#74497)

157bbb2

[xpu] fix compile (PaddlePaddle#74492)

c55f40b

CINN Add more Simplify scenario: Select2MinMax, BoundSimplify, PowerL…

026dd5f

…og2BitOp. (PaddlePaddle#74292)

dygraph support input a out Tensor (PaddlePaddle#74484)

c8f5638

* dygraph support input a out Tensor * refine * refine * refine * refine * refine * refine * refine * refine

[Fix Issue] paddle.unique Exhibits Inconsistent Behavior with NaN Val…

07368c2

…ues Between CPU and GPU (PaddlePaddle#74222) * fix issue 73692 * fix error

correct copysign backward (PaddlePaddle#74322)

cd23a02

* correct copysign backward * correct codestyle

Enhancing fused_transpose_split_quant with fp8 capability. (PaddlePad…

c91fe43

…dle#74471) * Enhanced fused_transpose_split_quant with fp8 capability. * optimize performance. * Clean comment * clean miscs * Fix example

test_mkldnn_matmul_v2_elementwise_add_fuse_pass.py modify use_mkldnn …

fbd9f59

…[fluid_ops] (PaddlePaddle#74507) * Fix * Fix

some create api support more usage (PaddlePaddle#74494)

17b20dd

[API-Compat] More unittest & static graph check & updated decorator

761cd99

[API-Compat] Updated ForbidKeywordsDecorator

bf0b293

[API-Compat] Static Graph and CPU end debug

1514511

[API-Compat] Revert erroneous rebase

c55dc29

[API-Compat] Removed one split unittest, since the former PR is not m…

9f7d036

…erged.

Enigmatisms requested review from SigureMo, DrRyanHuang, zrr1999 and gouzil as code owners August 11, 2025 13:03

Enigmatisms closed this Aug 11, 2025

Enigmatisms deleted the min_max_api branch August 29, 2025 05:04
